November 14, 2017
The following 5-step outline is a useful way to organize the ideas of hypothesis testing:
Later, we'll be learning how to compare means of two groups. Our null hypothesis is that they are equal:
\[ H_{0}: \mu_{1} = \mu_{2} \]
Alternatively, we may have a research “hunch” that one is larger than the other, resulting in a one-tailed alternative:
\[ H_{a}: \mu_{1} > \mu_{2} \quad or \quad H_{a}: \mu_{1} < \mu_{2} \]
If the research hypothesis is not clearly in one direction only, we conduct a two-tailed test:
\[ H_{a}: \mu_{1} \neq \mu_{2} \]
The advantage of a one-tailed test is that sometimes it is intuitive to only look in one direction. Also, this makes it easier to reject \(H_{0}\).
Two-tailed tests have several advantages:
Recall our standard error of the mean formula:
\[ SEM = \frac{\sigma}{\sqrt{n}} \]
Our test statistic, \(t\), is of the form
\[ t = \frac{estimate − hypothesis\ value}{standard\, error} \]
Since we don't know our population standard deviation, \(\sigma\), we use our estimate from the sampled data (\(s\)).
Our null hypothesis states we believe the true mean of the population we are sampling from, \(\mu\), is equal to some value, \(\mu_{0}\)
\[ H_{0}: \mu = \mu_{0} \]
Our test statistic then becomes:
\[ t = \frac{\bar{x} − \mu_{0}}{\frac{s}{\sqrt{n}}} \]
This statistic has been shown to follow a t-distribution with \(n - 1\) degrees of freedom. Our confidence intervals take the form:
\[ \bar{x} \pm t_{*} SE \]
RThese are not frequently used in practice. They test whether the mean of the sample differs from a reference value.
exprs <- rnorm(n = 5, mean = 11.1, sd = 0.84) t.test(exprs, mu = 10)
## ## One Sample t-test ## ## data: exprs ## t = 1.6251, df = 4, p-value = 0.1795 ## alternative hypothesis: true mean is not equal to 10 ## 95 percent confidence interval: ## 9.459006 12.068207 ## sample estimates: ## mean of x ## 10.76361
Two sample Student's t-test:
\[ H_{M} = H_{F} \\ H_{a} \neq 10 \]
sex <- c(rep('M', 5), rep('F', 5))
weight <- c(140, 180, 188, 210, 190, 120, 190, 129, 120, 130)
t.test(weight~sex, var.equal = TRUE)
## ## Two Sample t-test ## ## data: weight by sex ## t = -2.4982, df = 8, p-value = 0.03705 ## alternative hypothesis: true difference in means is not equal to 0 ## 95 percent confidence interval: ## -84.23077 -3.36923 ## sample estimates: ## mean in group F mean in group M ## 137.8 181.6
Welch's t-test relaxes the equal variances assumption by not using a pooled variance estimator and does not have any substantial disadvantages to Student's t-test. It is the default test in R and I would suggest you to use it in almost all situations.
\[ t \quad = \quad {\; \overline{X}_1 - \overline{X}_2 \; \over \sqrt{ \; {s_1^2 \over N_1} \; + \; {s_2^2 \over N_2} \quad }} \]
# welch's is default test in R t.test(weight~sex)
## ## Welch Two Sample t-test ## ## data: weight by sex ## t = -2.4982, df = 7.8515, p-value = 0.03758 ## alternative hypothesis: true difference in means is not equal to 0 ## 95 percent confidence interval: ## -84.36429 -3.23571 ## sample estimates: ## mean in group F mean in group M ## 137.8 181.6
Trace metals in drinking water affect the flavor and an unusually high concentration can pose a health hazard. Ten pairs of data were taken measuring zinc concentration in bottom water and surface water. Does the data suggest that the true average concentration in the bottom water exceeds that of surface water?
Click here to download the zinc data set from CourseConnect.
zinc <- read.delim('data/zinc_data.txt')
A paired t-test is just a one-sample t-test on the difference (\(D\)) of the paired observations and \(\mu_{0} = 0\). We would state our null and alternative hypotheses as follows:
\[ H_{0}: D = 0 \\ H_{a}: D \neq 0 \]
Our sample must be from an approximately normal distribution. Our sample size is small, so normality is difficult to assess with a histogram. A reasonably linear QQ-plot gives us confidence this assumption is met.
d <- zinc[['bottom']] - zinc[['surface']] par(mfrow = c(1, 2)) hist(d) qqnorm(d) qqline(d)
par(mfrow = c(1, 1))
Steps 3 and 4 are accomplished simultaneously by running t.test in R.
t.test(d, mu = 0)
## ## One Sample t-test ## ## data: d ## t = 4.8638, df = 9, p-value = 0.0008911 ## alternative hypothesis: true mean is not equal to 0 ## 95 percent confidence interval: ## 0.043006 0.117794 ## sample estimates: ## mean of x ## 0.0804
We can store the t-test object and extract information from it:
result <- t.test(d, mu = 0) result$statistic
## t ## 4.863813
result$p.value
## [1] 0.0008911155
result$estimate
## mean of x ## 0.0804
result$conf.int
## [1] 0.043006 0.117794 ## attr(,"conf.level") ## [1] 0.95
t.test has a paired option which will allow us to pass it two paired vectors and it will compute the difference for us:
t.test(x = zinc[['bottom']], y = zinc[['surface']], paired = TRUE)
## ## Paired t-test ## ## data: zinc[["bottom"]] and zinc[["surface"]] ## t = 4.8638, df = 9, p-value = 0.0008911 ## alternative hypothesis: true difference in means is not equal to 0 ## 95 percent confidence interval: ## 0.043006 0.117794 ## sample estimates: ## mean of the differences ## 0.0804
There is sufficient statistical evidence (p = 0.0008) to conclude that, on average, bottom water zinc concentration is higher than surface water zinc concentration. Furthermore, we are 95% confident that bottom water, on average, had a concentration of 0.043 to 0.117 units of zinc higher than surface water.
State the conclusion of your test. If significant, include a confidence interval for interpretation.
What happens if we do not use a paired t-test:
t.test(x = zinc[['bottom']], y = zinc[['surface']], var.equal = TRUE)
## ## Two Sample t-test ## ## data: zinc[["bottom"]] and zinc[["surface"]] ## t = 1.2913, df = 18, p-value = 0.2129 ## alternative hypothesis: true difference in means is not equal to 0 ## 95 percent confidence interval: ## -0.05040823 0.21120823 ## sample estimates: ## mean of x mean of y ## 0.5649 0.4845